Domain shift widely exists in the visual world, while modern deep neural networks commonly suffer from severe performance degradation under domain shift due to the poor generalization ability, which limits the real-world applications. The domain shift mainly lies in the limited source environmental variations and the large distribution gap between source and unseen target data. To this end, we propose a unified framework, Style-HAllucinated Dual consistEncy learning (SHADE), to handle such domain shift in various visual tasks. Specifically, SHADE is constructed based on two consistency constraints, Style Consistency (SC) and Retrospection Consistency (RC). SC enriches the source situations and encourages the model to learn consistent representation across style-diversified samples. RC leverages general visual knowledge to prevent the model from overfitting to source data and thus largely keeps the representation consistent between the source and general visual models. Furthermore, we present a novel style hallucination module (SHM) to generate style-diversified samples that are essential to consistency learning. SHM selects basis styles from the source distribution, enabling the model to dynamically generate diverse and realistic samples during training. Extensive experiments demonstrate that our versatile SHADE can significantly enhance the generalization in various visual recognition tasks, including image classification, semantic segmentation and object detection, with different models, i.e., ConvNets and Transformer.
translated by Google Translate
Semantic segmentation in 3D indoor scenes has achieved remarkable performance under the supervision of large-scale annotated data. However, previous works rely on the assumption that the training and testing data are of the same distribution, which may suffer from performance degradation when evaluated on the out-of-distribution scenes. To alleviate the annotation cost and the performance degradation, this paper introduces the synthetic-to-real domain generalization setting to this task. Specifically, the domain gap between synthetic and real-world point cloud data mainly lies in the different layouts and point patterns. To address these problems, we first propose a clustering instance mix (CINMix) augmentation technique to diversify the layouts of the source data. In addition, we augment the point patterns of the source data and introduce non-parametric multi-prototypes to ameliorate the intra-class variance enlarged by the augmented point patterns. The multi-prototypes can model the intra-class variance and rectify the global classifier in both training and inference stages. Experiments on the synthetic-to-real benchmark demonstrate that both CINMix and multi-prototypes can narrow the distribution gap and thus improve the generalization ability on real-world datasets.
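To make the multi-prototype idea above concrete, here is a minimal sketch of nearest-prototype classification in which each class keeps several prototype vectors. It is an illustration only: the cosine scoring rule, the dictionary layout, and the names `cosine` and `classify_with_prototypes` are assumptions, and the sketch does not reproduce how the paper builds the prototypes or rectifies the global classifier.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def classify_with_prototypes(feature, prototypes):
    """Return (class, score) for the class whose closest prototype best matches `feature`.

    `prototypes` maps a class name to a list of prototype vectors, so one
    class can cover several modes of intra-class variance (hypothetical layout).
    """
    best_class, best_score = None, -float("inf")
    for cls, protos in prototypes.items():
        # A class is scored by its single most similar prototype.
        score = max(cosine(feature, p) for p in protos)
        if score > best_score:
            best_class, best_score = cls, score
    return best_class, best_score


# Toy example: "chair" has two prototypes, "table" has one.
prototypes = {
    "chair": [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0]],
    "table": [[0.0, 0.0, 1.0]],
}
label, score = classify_with_prototypes([0.6, 0.8, 0.0], prototypes)
```

Keeping several prototypes per class lets a feature match whichever mode it is closest to, which is one plausible way to absorb the variance introduced by augmented point patterns.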
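Background: To quantitatively predict adolescents' spherical equivalent based on their variable-length historical vision records. Methods: From October 2019 to March 2022, we examined the binocular uncorrected visual acuity, axial length, corneal curvature, and axis of 75,172 eyes from 37,586 adolescents aged 6-20 years in Chengdu, China. 80% of the samples form the training set and the remaining 20% form the test set. A Time-Aware Long Short-Term Memory network was used to quantitatively predict the adolescents' spherical equivalent within two and a half years. Results: The mean absolute prediction error on the test set was 0.273-0.257 for spherical equivalent, ranging from 0.189-0.160 to 0.596-0.473 when different lengths of historical records and different prediction durations are considered. Conclusions: Time-Aware Long Short-Term Memory was applied to capture the temporal features in irregularly sampled time series, which is more in line with the characteristics of real data and therefore has higher applicability, and it helps to identify the progression of myopia earlier. The overall error of 0.273 is much smaller than the criterion for clinically acceptable prediction, e.g., 0.75.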
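The two quantitative ingredients mentioned above, handling irregular visit intervals and scoring with mean absolute error, can be sketched as follows. This is a hedged illustration: the decay `1 / log(e + Δt)` is one form commonly used in time-aware LSTM formulations, and the abstract does not say which variant the authors used. In a real model the short-term component would come from a learned layer; here it is passed in directly, and all function names are hypothetical.

```python
import math


def time_decay(delta_t):
    """Monotonically decreasing weight for an elapsed time `delta_t` >= 0 (assumed form)."""
    return 1.0 / math.log(math.e + delta_t)


def adjust_memory(cell, short_term, delta_t):
    """Discount only the short-term part of the previous cell memory.

    long-term part = cell - short_term (kept as is)
    short-term part is scaled by the time decay before being added back.
    """
    g = time_decay(delta_t)
    return [(c - s) + g * s for c, s in zip(cell, short_term)]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy usage: a longer gap between visits shrinks the short-term memory more.
recent = adjust_memory([1.0, -0.5], [0.4, -0.2], delta_t=1.0)
stale = adjust_memory([1.0, -0.5], [0.4, -0.2], delta_t=30.0)
mae = mean_absolute_error([-1.0, -2.5], [-1.25, -2.0])
```

With a zero gap the decay equals 1 and the memory is unchanged, so regularly sampled data reduces to an ordinary LSTM update in this sketch.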
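Since Intersection-over-Union (IoU) based optimization maintains consistency between the final IoU prediction metric and the loss, it has been widely used in both the regression and classification branches of single-stage 2D object detectors. Recently, several 3D object detection methods have adopted IoU-based optimization and directly replaced the 2D IoU with the 3D IoU. However, such a direct computation in 3D is very costly because of its complex implementation and inefficient backward operations. Moreover, 3D IoU-based optimization is sub-optimal, as it is sensitive to rotation and can therefore cause training instability and degrade detection performance. In this paper, we propose a novel Rotation-Decoupled IoU (RDIoU) method that mitigates the rotation-sensitivity issue and produces more efficient optimization objectives than 3D IoU during the training stage. Specifically, our RDIoU simplifies the complex interactions of the regression parameters by decoupling the rotation variable as an independent term while preserving the geometry of 3D IoU. By incorporating RDIoU into both the regression and classification branches, the network is encouraged to learn more precise bounding boxes and, at the same time, to overcome the misalignment between classification and regression. Extensive experiments on the KITTI and Waymo Open Dataset benchmarks validate that our RDIoU method can bring substantial improvements to single-stage 3D object detection.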
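As a rough illustration of what "decoupling the rotation variable as an independent term" could look like, the sketch below computes an axis-aligned IoU in any number of dimensions and then treats the yaw angle as an extra axis with a fixed extent. This is one possible reading of the abstract, not the authors' formulation: the function names, the `(x, y, z, l, w, h, yaw)` box layout, and the `rot_extent` parameter are all assumptions, and angle wrap-around is ignored.

```python
def axis_aligned_iou(center_a, size_a, center_b, size_b):
    """IoU of two axis-aligned boxes given per-axis centers and sizes."""
    inter, vol_a, vol_b = 1.0, 1.0, 1.0
    for ca, sa, cb, sb in zip(center_a, size_a, center_b, size_b):
        lo = max(ca - sa / 2, cb - sb / 2)
        hi = min(ca + sa / 2, cb + sb / 2)
        inter *= max(0.0, hi - lo)  # zero overlap on any axis zeroes the product
        vol_a *= sa
        vol_b *= sb
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0


def rotation_as_extra_axis_iou(box_a, box_b, rot_extent=1.0):
    """Treat yaw as a fourth, independent axis (hypothetical simplification).

    Each box is (x, y, z, l, w, h, yaw). The rotated-box intersection is never
    computed; a yaw difference only reduces overlap along the extra axis.
    """
    center_a = list(box_a[:3]) + [box_a[6]]
    center_b = list(box_b[:3]) + [box_b[6]]
    size_a = list(box_a[3:6]) + [rot_extent]
    size_b = list(box_b[3:6]) + [rot_extent]
    return axis_aligned_iou(center_a, size_a, center_b, size_b)


# Same geometry, yaw differs by 0.5 rad: overlap shrinks smoothly with the angle gap.
same = rotation_as_extra_axis_iou((0, 0, 0, 4, 2, 1.5, 0.0), (0, 0, 0, 4, 2, 1.5, 0.0))
turned = rotation_as_extra_axis_iou((0, 0, 0, 4, 2, 1.5, 0.0), (0, 0, 0, 4, 2, 1.5, 0.5))
```

Because every axis contributes a simple min/max term, the expression avoids the polygon clipping that makes exact rotated 3D IoU expensive to differentiate.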
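Through Human-Centered Research (HCR), we can steer research activities so that the research outcome is beneficial for human stakeholders, such as end users. But what exactly makes research human-centered? We address this question by providing a working definition and by defining how a research pipeline can be split into different stages in which human-centered components can be added. Additionally, we discuss existing NLP work with HCR components and define a series of guiding questions that can serve as starting points for researchers interested in exploring human-centered research approaches. We hope that this work will inspire researchers to refine the proposed definition and to pose other questions that might be meaningful for achieving HCR.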
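In this paper, we study the task of synthetic-to-real domain generalized semantic segmentation, which aims to learn a model that is robust to real-world scenes using only synthetic data. The large domain shift between synthetic and real-world data, including the limited source environmental variations and the large distribution gap between synthetic and real-world data, significantly hinders model performance on unseen real-world scenes. In this work, we propose the Style-HAllucinated Dual consistEncy learning (SHADE) framework to handle such domain shift. Specifically, SHADE is constructed based on two consistency constraints, Style Consistency (SC) and Retrospection Consistency (RC). SC enriches the source situations and encourages the model to learn consistent representations across style-diversified samples. RC leverages real-world knowledge to prevent the model from overfitting to synthetic data and thus largely keeps the representation consistent between the synthetic and real-world models. Furthermore, we present a novel style hallucination module (SHM) to generate style-diversified samples that are essential to consistency learning. SHM selects basis styles from the source distribution, enabling the model to dynamically generate diverse and realistic samples during training. Experiments show that SHADE outperforms state-of-the-art methods in average mIoU over three real-world datasets in both single-source and multi-source settings.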
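The style hallucination step described above can be sketched on a single feature channel. The sketch assumes that a "style" is the channel-wise mean and standard deviation, that a new style is a random convex combination of basis styles, and that the combination weights come from a Dirichlet distribution. These are assumptions made for illustration, and all function names are hypothetical rather than taken from the authors' code.

```python
import math
import random


def channel_stats(channel):
    """Mean and (epsilon-stabilized) standard deviation of one feature channel."""
    mean = sum(channel) / len(channel)
    var = sum((v - mean) ** 2 for v in channel) / len(channel)
    return mean, math.sqrt(var + 1e-6)


def sample_combination_weights(num_basis, rng, concentration=1.0):
    """Dirichlet sample obtained by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(num_basis)]
    total = sum(draws)
    return [d / total for d in draws]


def hallucinate_style(channel, basis_styles, rng):
    """Re-style `channel` with a random convex combination of basis (mean, std) pairs."""
    weights = sample_combination_weights(len(basis_styles), rng)
    new_mean = sum(w * m for w, (m, _) in zip(weights, basis_styles))
    new_std = sum(w * s for w, (_, s) in zip(weights, basis_styles))
    mean, std = channel_stats(channel)
    # Normalize away the original style, then apply the hallucinated one.
    return [new_std * (v - mean) / std + new_mean for v in channel]


rng = random.Random(0)
basis_styles = [(0.0, 1.0), (2.0, 3.0)]  # hypothetical basis styles
styled = hallucinate_style([1.0, 2.0, 3.0, 4.0], basis_styles, rng)
```

A style consistency loss would then compare the model's predictions on the original and the re-styled features; that part is omitted here.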
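Deep learning-based approaches have shown remarkable performance in the 3D object detection task. However, they suffer from a catastrophic performance drop on the originally trained classes when incrementally learning new classes without revisiting the old data. This "catastrophic forgetting" phenomenon impedes the deployment of 3D object detection approaches in real-world scenarios, where continuous learning systems are needed. In this paper, we study the unexplored yet important class-incremental 3D object detection problem and present the first solution, SDCoT, a novel static-dynamic co-teaching method. Our SDCoT alleviates the catastrophic forgetting of old classes via a static teacher, which provides pseudo annotations for old classes in the new samples and regularizes the current model by extracting previous knowledge with a distillation loss. At the same time, SDCoT consistently learns the underlying knowledge from new data via a dynamic teacher. We conduct extensive experiments on two benchmark datasets and demonstrate the superior performance of our SDCoT over baseline approaches in several incremental learning scenarios.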
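Because of information overload, it is difficult for users to find content that interests them among the many available choices. To improve the user experience, recommender systems have been widely used in music recommendation, movie recommendation, online shopping, and other scenarios. Recently, the Knowledge Graph (KG) has been proven to be an effective tool for improving the performance of recommender systems. However, a major challenge in applying knowledge graphs to recommendation is how to use them to obtain better user and item encodings. In response to this problem, this study proposes a recommendation algorithm with a user Recurrent Neural Network (RNN) encoder and an item encoder based on the Knowledge Graph (URIR). The study generates the representation vectors of items by capturing high-order neighbor information, applies an RNN together with the item representation vectors to encode users into user representation vectors, and then performs an inner product operation on the user and item representation vectors to obtain the probability that a user interacts with an item. Numerical experiments on three real-world datasets demonstrate that URIR outperforms state-of-the-art algorithms on metrics such as AUC, Precision, Recall, and MRR. This implies that URIR can effectively use the knowledge graph to obtain better user and item encodings, and thereby better recommendation results.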
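The user-encoding and scoring steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: a plain Elman-style RNN stands in for the user encoder, the item vectors are given rather than derived from a knowledge graph, and a sigmoid maps the inner product to a probability. The weights, the sigmoid, and the function names are hypothetical and not taken from the paper.

```python
import math


def rnn_encode(item_vectors, w_in, w_rec):
    """Encode a sequence of item vectors into one user vector.

    Elman recurrence: h_t = tanh(W_in x_t + W_rec h_{t-1}), starting from h_0 = 0.
    `w_in` has shape (hidden, item_dim); `w_rec` has shape (hidden, hidden).
    """
    hidden = [0.0] * len(w_rec)
    for x in item_vectors:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[i], x))
                + sum(w * hj for w, hj in zip(w_rec[i], hidden))
            )
            for i in range(len(hidden))
        ]
    return hidden


def interaction_probability(user_vec, item_vec):
    """Sigmoid of the inner product between user and item vectors (assumed link)."""
    score = sum(u * v for u, v in zip(user_vec, item_vec))
    return 1.0 / (1.0 + math.exp(-score))


# Toy usage with 2-dimensional item vectors and a 2-dimensional hidden state.
history = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]   # items the user interacted with
w_in = [[0.8, -0.2], [0.1, 0.9]]                 # hypothetical weights
w_rec = [[0.5, 0.0], [0.0, 0.5]]
user_vec = rnn_encode(history, w_in, w_rec)
prob = interaction_probability(user_vec, [0.0, 1.0])
```

Because the user vector is produced from item vectors, users and items live in the same space, which is what makes the inner product a meaningful score.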
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs have a large number of parameters, which makes them computationally expensive. It is therefore difficult to deploy them on edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution for compressing GNNs, in which a lightweight model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness consideration. As a consequence, the student model usually inherits and even exaggerates the bias from the teacher GNN. To handle this problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. We then propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborate that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
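For readers unfamiliar with the teacher-student setup mentioned above, the following is a minimal sketch of a standard temperature-scaled distillation loss on one node's logits. It shows generic knowledge distillation only; it does not implement RELIANT's bias mitigation, and the temperature value and function names are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Toy usage: the loss is zero when the student matches the teacher exactly.
matched = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
mismatched = distillation_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

A student trained only on this objective copies whatever the teacher predicts, including any group-dependent errors, which is why a separate fairness mechanism is needed.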
With the development of technology and the sharing economy, Airbnb, a well-known short-term rental platform, has become the first choice of many young people. Airbnb pricing has long been a problem worth studying. While previous studies achieve promising results, several deficiencies remain: (1) the feature attributes of rentals are not rich enough; (2) the research on rental text information is not deep enough; (3) few studies predict the rental price in combination with the points of interest (POI) around the house. To address these challenges, we propose a multi-source information embedding (MSIE) model to predict the rental price of Airbnb listings. Specifically, we first select statistical features to embed the original rental data. Second, we generate word feature vectors and emotional scores from three different kinds of text information and combine them to form the text feature embedding. Third, we use the points of interest (POI) around the rental house to generate a variety of spatial network graphs, and learn the embedding of these networks to obtain the spatial feature embedding. Finally, we combine the three modules into multi-source rental representations and use a fully connected neural network to predict the price. The analysis of the experimental results shows the effectiveness of the proposed model.